PROGRAMMABLE PACKET HEADER PROCESSOR

The present invention relates to the field of microprocessors. More 
specifically, the present invention relates to a microprocessor designed to 
read, process and reformat data created within the header portion of a data 
packet. 

Presently, most data communication networks rely on packet 
transmission of data, such as the IP protocol defined by Internet Engineering 
Task Force (IETF) RFC 791. Network nodes are required to check, process and
route the control information contained in each packet, usually in the leading 
header fields, in accordance with the supported protocol definitions. 

Current methods of obtaining higher performance and throughput 
include using hardware logic to implement these functions. However, this 
has the drawback that new hardware has to be installed if the current protocol 
definitions change or if new protocols are to be supported. 

The present invention overcomes this drawback by using a
Programmable Header Translator that employs parallel processing logic 
blocks to achieve high throughput and downloaded microcode to give 
flexibility for changing functionality. 

According to the present invention there is provided a programmable 
packet header processor apparatus arranged to read, process and reformat 
fields within the header of a data packet, wherein said processor comprises a 
Programmable Header Translator device which employs a plurality of parallel 
processing logic blocks which operate with a downloaded microcode to 
flexibly reconfigure processing algorithms.

The processor may further comprise an address look-up engine which
functions to facilitate the read, process and reformat of the fields within the
header of the data packet. Furthermore, the processing algorithms may be
obtained from the downloadable microcode which is contained in an external
source.



Said Programmable Header Translator is intended to operate closely 
with an address look-up engine to form a packet header switching engine. 
The main requirements of said packet header switching engine are:

- read in a programmable number of packet bytes and port address if required 
and a length field, referred to as the Unresolved Header Record, 

- extract appropriate fields from the Unresolved Header Record, depending 
on the supported protocols, 

- perform exact match or longest best match searches on the extracted 
information via said address look-up engine, 

- maintain a high header processing throughput, 

- allow offline update of programming information, 

- run the field processing algorithms from downloadable microcode using an 
instruction set, and 

- return pre-programmed information, including modified header, output 
queue control and status information, referred to as the Resolved Header 
Record. 
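The two records named in the requirements above can be pictured as simple data structures. The following is a minimal sketch only: the class layout, the field names (`header_bytes`, `packet_length`, `port`, `output_queue`, `status`) and the helper `make_unresolved` are illustrative assumptions and are not defined by this specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnresolvedHeaderRecord:
    """Input record: leading packet bytes, a length field and an optional port."""
    header_bytes: bytes          # programmable number of leading packet bytes
    packet_length: int           # length field for the whole packet
    port: Optional[int] = None   # port address, present only if required


@dataclass(frozen=True)
class ResolvedHeaderRecord:
    """Output record: modified header plus output queue control and status."""
    modified_header: bytes
    output_queue: int            # hypothetical encoding of output queue control
    status: int                  # hypothetical status code


def make_unresolved(packet: bytes, n_bytes: int,
                    port: Optional[int] = None) -> UnresolvedHeaderRecord:
    """Copy the first n_bytes of a packet into an Unresolved Header Record."""
    return UnresolvedHeaderRecord(packet[:n_bytes], len(packet), port)
```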

The instruction set referred to above includes: 

- extracting any contiguous bit string from the header as indexed by bit 
positions, 

- forming a concatenated set of fields to present to said address look-up 
engine,

- comparing (<, =, > etc.) registers with other registers or code constants,

- branching to allow multiple paths through the microcode, 

- performing a longest match algorithm via said address look-up engine 
operating in a best longest match mode, 

- performing addition and subtraction operations, 

- bit shifting, 

- performing checksums over a range of packet header data, and 

- performing an exact match algorithm via said address look-up engine 
operating in an exact match mode. 
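Three of the operations listed above (bit-string extraction, field concatenation and checksumming) can be sketched in software as follows. This is an illustration under stated assumptions, not the microcode instruction set itself: the function names are hypothetical, bit 0 is assumed to be the most significant bit of the first header byte, and the checksum shown is the 16-bit ones' complement sum used by the IP header of RFC 791.

```python
def extract_bits(header: bytes, start: int, length: int) -> int:
    """Return `length` bits starting at bit `start` (bit 0 = MSB of byte 0)."""
    if length <= 0 or start < 0 or start + length > 8 * len(header):
        raise ValueError("bit range outside header")
    value = int.from_bytes(header, "big")
    shift = 8 * len(header) - (start + length)
    return (value >> shift) & ((1 << length) - 1)


def concat_fields(fields):
    """Concatenate (value, width_in_bits) pairs into one look-up key.

    Returns the key and its total width in bits.
    """
    key, width = 0, 0
    for value, bits in fields:
        key = (key << bits) | (value & ((1 << bits) - 1))
        width += bits
    return key, width


def ones_complement_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over a range of header bytes."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

For a 20-byte IP header, `extract_bits(header, 72, 8)` would return the protocol field and `extract_bits(header, 128, 32)` the destination address; the two could then be passed to `concat_fields` to form a single 40-bit key.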

The processing algorithms are run from downloadable microcode to
allow simple modification of the processing algorithms. 

By using a small set of highly optimised instructions located in tightly 
coupled RAM it may be possible for the Programmable Header Translator, in 
conjunction with the address look-up engine, to perform look-up on virtually 
any combination of layer 2, 3, and 4 protocol fields. 
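The difference between the exact match mode and the longest match mode can be illustrated with a small software stand-in for the address look-up engine. The internals of the engine are not described in this specification, so the class `ToyLookupEngine`, its dictionary storage and its method names are assumptions made purely for illustration; a hardware engine would not search in this way.

```python
class ToyLookupEngine:
    """Software stand-in for the address look-up engine (internals assumed)."""

    def __init__(self, key_bits: int):
        self.key_bits = key_bits
        self.entries = {}   # (prefix_value, prefix_len) -> stored result

    def add(self, prefix: int, prefix_len: int, result) -> None:
        """Store a result under the top `prefix_len` bits, right-aligned."""
        self.entries[(prefix, prefix_len)] = result

    def exact_match(self, key: int):
        """Return the result stored for the full-width key, or None."""
        return self.entries.get((key, self.key_bits))

    def longest_match(self, key: int):
        """Return the result for the longest stored prefix of the key, or None."""
        for plen in range(self.key_bits, -1, -1):
            prefix = key >> (self.key_bits - plen)
            hit = self.entries.get((prefix, plen))
            if hit is not None:
                return hit
        return None
```

The key presented to such an engine may be any concatenation of layer 2, 3 and 4 fields; only its width in bits needs to be agreed between the microcode and the table contents.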

While the principal advantages and features of the invention have been
described above, a greater understanding and appreciation of the invention
may be obtained by referring to the following drawings and detailed
description of a preferred embodiment, presented by way of example only, in
which:

Figure 1 shows the functional block structure of the Programmable 
Header Translator according to a first aspect of the present invention, and 

Figure 2 shows the functional block structure of the Programmable 
Header Translator according to a second aspect of the present invention. 

In figure 1, the Programmable Header Translator 1 is shown.

From start-up, microcode is downloaded from an external source 2 and
written firstly to internal RAM 6 disposed proximate the Header Processor 
Selector 8 and then to a plurality of parallel processing logic blocks, referred 
to in this preferred embodiment as Header Processing Engines 10. The 
external source 2 may be RAM/ROM or a microprocessor. Alternatively, the 
microcode is downloaded from external source 2 and written directly to the 
Header Processing Engines 10. The number of Header Processing Engines 
may vary according to the specific requirements of the microprocessor. In 
this preferred embodiment three Header Processing Engines 10a, 10b, and 
10c are shown. 

Configuration data is also downloaded from an external source 2, 
which may be RAM/ROM or a microprocessor, and is written to various 
configuration registers 4. The Programmable Header Translator 1 is then 
enabled and ready for processing. 

An external system (not shown) extracts a programmable number of 
octets from the front of a data packet, ensuring that enough data has been 
extracted to contain all relevant header fields. This can be concatenated with 
level 2 fields if required, for example with VPI/VCI for ATM networks, and 
then stored in the Unresolved Header Record 20 along with the packet length 
and various control information, such as a local packet identifier. The 
Unresolved Header Record 20 is then transferred to the Programmable 
Header Translator 1. A Header Buffer 22 provides a buffer to absorb 
temporary peaks of traffic that exceed the sustained processing rate of the 
Programmable Header Translator 1 and also store the Unresolved Header 
Record 20 until a Header Processing Engine 10a, 10b or 10c becomes 
available. 
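The role of the Header Buffer 22 can be modelled as a bounded first-in, first-out queue. The sketch below is an assumption-laden illustration: the class name, the fixed `capacity` and the policy of refusing a record when the buffer is full are not taken from this specification, which does not state what happens on overflow.

```python
from collections import deque


class HeaderBuffer:
    """Bounded FIFO that holds records until an engine becomes available."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._queue = deque()

    def push(self, record) -> bool:
        """Store a record; return False when the buffer is full (assumed policy)."""
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(record)
        return True

    def pop(self):
        """Remove and return the oldest record, or None if the buffer is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)
```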
As each Unresolved Header Record 20 is removed from the Header
Buffer 22 it is loaded into a Header Processing Engine (HPE) 10 by the
Header Processor Selector 8. This step may be controlled by a simple 'next
free HPE' algorithm or, alternatively, there may be a more intelligent selection
method which pre-processes the Unresolved Header Record and directs 
known protocols to certain Header Processing Engines. For this feature, 
specific microcode may be downloaded to the Header Processor Selector 8 
from the external source 2. The Header Processor Selector 8 may 
communicate directly with the Header Output Selector 30, via connector 25,
to maintain data packet sequence integrity. 
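The two selection methods described above might be sketched as follows. Both functions are hypothetical. In particular, the use of the first byte of the record as a protocol tag, the `preferred` mapping, and the choice to make a record of a known protocol wait when its engine is busy are assumptions chosen for illustration.

```python
def next_free_hpe(busy):
    """Return the index of the first idle engine, or None if all are busy."""
    for index, is_busy in enumerate(busy):
        if not is_busy:
            return index
    return None


def protocol_directed_hpe(record: bytes, busy, preferred):
    """Pre-inspect the record and direct known protocols to a fixed engine.

    `preferred` maps a hypothetical one-byte protocol tag (here simply the
    first byte of the record) to an engine index. Unknown tags use the
    next-free rule; a known tag whose engine is busy yields None, meaning
    the record stays in the buffer.
    """
    tag = record[0] if record else None
    index = preferred.get(tag)
    if index is None:
        return next_free_hpe(busy)
    return None if busy[index] else index
```

Making a known protocol wait for its own engine keeps the sketch valid when different engines hold different microcode; where all engines run the same microcode, falling back to any free engine would be an equally reasonable choice.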

The Header Processing Engines 10 then perform the necessary
protocol dependent algorithms from the microcode stored in an internal
RAM. A number of configurations are possible, depending upon 
performance and silicon utilisation tradeoffs. 

In a first configuration, as shown in figure 1, each Header Processing 
Engine 10a, 10b, 10c has its own integral internal RAM 32a, 32b, 32c. A 
separate copy of the microcode is held in the internal RAM 32a, 32b, 32c 
dedicated for that engine. This is the most flexible arrangement allowing for 
each different Header Processing Engine to perform different functions and 
also gives the highest throughput since no RAM contention will exist. 

In a second configuration, as shown in figure 2 with parts also 
appearing in figure 1 bearing identical designation, all Header Processing 
Engines 10a, 10b, 10c access a single microcode held in an internal RAM 32. 
In this configuration the internal RAM 32 is located external to the Header 
Processing Engines 10a, 10b, 10c, but is still internal to the Programmable 
Header Translator 1. Thus, in this second configuration all Header
Processing Engines run the same microcode and suffer from memory access
hold-off due to contention.
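The throughput difference between the two configurations can be made concrete with a deliberately crude cycle model. It assumes that every engine wants one instruction fetch per cycle and that a shared RAM has a single port granted in round-robin order; neither assumption comes from this specification, and the function name is hypothetical.

```python
def simulate_fetches(num_engines: int, cycles: int, shared: bool) -> list:
    """Count instruction fetches completed per engine over `cycles` cycles.

    With dedicated RAMs every engine fetches each cycle. With one shared,
    single-ported RAM only one engine is served per cycle and the others
    are held off.
    """
    completed = [0] * num_engines
    for cycle in range(cycles):
        if shared:
            completed[cycle % num_engines] += 1   # round-robin grant
        else:
            for engine in range(num_engines):     # private RAMs never contend
                completed[engine] += 1
    return completed
```

Under these assumptions three engines complete twelve fetches each in twelve cycles with dedicated RAM, but only four each with a shared RAM, which is the cost paid for the saving in silicon.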

In both of the above mentioned configurations, each Header Processing
Engine is equipped with its own set of registers and comparators (not shown).
In addition, each Header Processing Engine may have access to a shared
external RAM 40 and an address look-up engine 34.

The shared external RAM 40 is necessary if common state information,
such as packet sequence numbering for IETF L2TP, is required by all Header
Processing Engines. In this case, the shared external RAM 40 requires access
scheduling and semaphore mechanisms to prevent stored data from getting 
out of date. In an alternative embodiment, the shared external RAM 40 may 
be implemented internal to the Programmable Header Translator 1. 
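The need for a semaphore mechanism can be illustrated by a sequence counter held in shared state and updated by several engines at once. In the sketch below threads stand in for Header Processing Engines and a lock stands in for the semaphore; the class and function names, and the 16-bit wrap of the counter, are illustrative assumptions.

```python
import threading


class SharedSequenceCounter:
    """Sequence number kept in shared state and guarded by a lock."""

    def __init__(self, modulo: int = 1 << 16):
        self._lock = threading.Lock()
        self._next = 0
        self._modulo = modulo

    def take(self) -> int:
        """Atomically return the current number and advance it."""
        with self._lock:
            value = self._next
            self._next = (self._next + 1) % self._modulo
            return value


def run_engines(counter: SharedSequenceCounter, engines: int, packets_each: int):
    """Run `engines` threads that each take `packets_each` sequence numbers."""
    taken = []
    taken_lock = threading.Lock()

    def worker():
        for _ in range(packets_each):
            number = counter.take()
            with taken_lock:
                taken.append(number)

    threads = [threading.Thread(target=worker) for _ in range(engines)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(taken)
```

Because the read and the update happen under one lock, no two engines can obtain the same number, which is the property that prevents the stored state from becoming out of date.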

Following processing, the Header Processing Engines return a Resolved
Header Record 45 to an external system (not shown) via a Header Output
Selector 30. In an alternative embodiment, the Header Output Selector 30
may select the Header Processing Engines 10 in the same order as the Header
Processor Selector 8. In yet a further embodiment, the Header Processing
Engines 10 send information directly to an external processor (not shown) to
provide a 'flow detection' feature.
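The embodiment in which the Header Output Selector 30 visits the engines in the same order as the Header Processor Selector 8 can be sketched as a small in-order release queue. The class and its method names are hypothetical, and the sketch assumes that each engine holds at most one record at a time and keeps its result until it is selected.

```python
from collections import deque


class InOrderOutputSelector:
    """Release results in the order in which their records were dispatched."""

    def __init__(self):
        self._dispatch_order = deque()   # engine indices, oldest dispatch first
        self._done = {}                  # engine index -> finished result

    def note_dispatch(self, engine: int) -> None:
        """Record that the input selector loaded a record into `engine`."""
        self._dispatch_order.append(engine)

    def note_done(self, engine: int, result) -> None:
        """Record that `engine` has finished and holds `result`."""
        self._done[engine] = result

    def drain(self) -> list:
        """Return every result that can be released without reordering."""
        released = []
        while self._dispatch_order and self._dispatch_order[0] in self._done:
            engine = self._dispatch_order.popleft()
            released.append(self._done.pop(engine))
        return released
```

A result from a fast engine is thus held back until all earlier records have been released, which preserves data packet sequence integrity at the cost of some added latency.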

As will be appreciated by those skilled in the art, various modifications 
may be made to the embodiment hereinbefore described without departing 
from the scope of the present invention. 




WO 00/10302 PCT/GB99/02554

CLAIMS 

1. A programmable packet header processor apparatus arranged to read,
process and reformat fields within a header of a data packet, wherein said 
processor comprises a Programmable Header Translator device which 
employs a plurality of parallel processing logic blocks which operate with a 
downloaded microcode to flexibly reconfigure processing algorithms. 

2. Processor apparatus as claimed in Claim 1, wherein said processor 
further comprises an address look-up engine which functions to facilitate said 
read, process, and reformat of said fields within said header of said data 
packet. 

3. Processor apparatus as claimed in Claims 1 or 2, wherein said
processing algorithms are obtained from said downloadable microcode
contained in an external source.

4. Processor apparatus as claimed in Claim 3, wherein said external
source is a RAM/ROM. 

5. Processor apparatus as claimed in Claim 3, wherein said external 
source is a microprocessor.

6. Processor apparatus as claimed in any of the preceding Claims, 
wherein said downloadable microcode comprises configuration data, which is 
downloaded to at least one configuration register for establishing a plurality 
of processing configurations. 

7. Processor apparatus as claimed in Claim 6, wherein said apparatus 
further comprises a header buffer means for storage of unresolved header 
records, and a header processor selector means which facilitates 
communication of said unresolved header record stored in said header buffer
to said plurality of parallel processing logic blocks, whereby said plurality of
parallel processing logic blocks operate in conjunction with said 
downloadable microcode to resolve said unresolved header record. 

8. Processor apparatus as claimed in Claim 7, wherein said downloadable 
microcode is stored in an internal RAM. 

9. Processor apparatus as claimed in Claim 8, wherein a first of said 
plurality of processing configurations operates such that each of said plurality 
of parallel processing logic blocks has a separate and dedicated copy of
said downloadable microcode. 

10. Processor apparatus as claimed in Claim 8, wherein a second of said
plurality of processing configurations operates such that all of said plurality
of parallel processing logic blocks access a single copy of said downloadable
microcode. 

11. Processor apparatus as claimed in any of the preceding Claims, wherein
each of said plurality of parallel processing logic blocks comprises at least
registers and comparators.

12. Processing apparatus as claimed in any of the preceding Claims, 
wherein said plurality of parallel processing logic blocks have access to a 
shared RAM for accessing common information. 

13. Processing apparatus as claimed in Claim 12, wherein said shared 
RAM is disposed external to said processing apparatus. 

14. Processing apparatus as claimed in Claim 12, wherein said shared 
RAM is disposed internal to said processing apparatus. 

15. Processing apparatus as claimed in any of the preceding Claims, 
wherein said plurality of parallel processing logic blocks communicate said 
resolved header record directly to an external system.

16. Processing apparatus as claimed in any of the preceding Claims 1 to
14, wherein said apparatus further comprises a header output selector which 
operates to facilitate communication of said resolved header record to an 
external system. 

17. Processing apparatus as claimed in any of the preceding claims, 
wherein said plurality of parallel processing logic blocks are Header 
Processing Engines. 

18. Processing apparatus as hereinbefore described with reference to the
accompanying drawings. 